Correcting Semantic Collocation Errors with L1-induced Paraphrases

Authors

  • Daniel Dahlmeier
  • Hwee Tou Ng
Abstract

We present a novel approach to automatic collocation error correction in learner English, based on paraphrases extracted from parallel corpora. Our key assumption is that collocation errors are often caused by semantic similarity in the writer's first language (L1). An analysis of a large corpus of annotated learner English confirms this assumption. We evaluate our approach on real-world learner data and show that L1-induced paraphrases outperform traditional approaches based on edit distance, homophones, and WordNet synonyms.
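The abstract does not spell out how paraphrases are extracted from parallel corpora. One common technique is pivoting: two English phrases count as paraphrases if they translate to the same L1 phrase. The sketch below illustrates that idea only and is not necessarily the authors' method. The function name, the toy phrase tables, and the placeholder pivot tokens (`L1_SEE`, `L1_TIMEPIECE`) are all hypothetical.

```python
from collections import defaultdict


def pivot_paraphrases(p_f_given_e, p_e_given_f, phrase):
    """Rank English paraphrase candidates for `phrase` via L1 pivot phrases.

    Score(e2 | e1) = sum over pivots f of p(f | e1) * p(e2 | f).
    Both tables are assumed to come from a phrase table learned on a
    parallel corpus; here they are hand-made toy values.
    """
    scores = defaultdict(float)
    for pivot, p_pivot in p_f_given_e.get(phrase, {}).items():
        for candidate, p_candidate in p_e_given_f.get(pivot, {}).items():
            if candidate != phrase:  # a phrase is not its own correction
                scores[candidate] += p_pivot * p_candidate
    # Highest-scoring candidates first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy tables: English -> L1 pivot, and L1 pivot -> English.
p_f_given_e = {"watch": {"L1_SEE": 0.7, "L1_TIMEPIECE": 0.3}}
p_e_given_f = {
    "L1_SEE": {"watch": 0.4, "see": 0.4, "look": 0.2},
    "L1_TIMEPIECE": {"watch": 1.0},
}

candidates = pivot_paraphrases(p_f_given_e, p_e_given_f, "watch")
for candidate, score in candidates:
    print(f"{candidate}\t{score:.2f}")
```

In a correction setting, a ranked list of this kind would serve as the confusion set for a suspect collocate, in place of candidates generated by edit distance, homophones, or WordNet synonyms.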

Similar resources

Evaluation on Second Language Collocational Congruency with Computational Semantic Similarity

Collocation learning is one of the important building blocks for the development of language competence. Remarkably, it is influenced by L1 and L2 congruency. The present study thus focused on the distinguishability of the computational similarity values between L2 collocates and L1 counterparts to establish the use of semantic similarity measure as a research instrument. The results showed tha...

L1 Transfer in L2 Acquisition of the There-Insertion Construction by Mandarin EFL Learners

This study examined the role of the native language (L1) transfer in a non-native language (L2) acquisition of the there-insertion construction at the syntax-semantics interface. Specifically, the study investigated if Mandarin EFL learners would make overgeneralization errors in the situation where an L1 argument structure constitutes a superset of its L2 counterpart. Verbs of existence and ap...

Automatic Generation of Syntactically Well-formed and Semantically Appropriate Paraphrases

Paraphrases of an expression are alternative linguistic expressions conveying the same information as the original. Technology for handling paraphrases has been attracting increasing attention due to its potential in a wide range of natural language processing applications; e.g., machine translation, information retrieval, question answering, summarization, authoring and revision support, and r...

Using Paraphrases of Deep Semantic Representions to Support Regression Testing in Spoken Dialogue Systems

Rule-based spoken dialogue systems require a good regression testing framework if they are to be maintainable. We argue that there is a tension between two extreme positions when constructing the database of test examples. On the one hand, if the examples consist of input/output tuples representing many levels of internal processing, they are fine-grained enough to catch most processing errors, ...

Compounds and Productivity in Advanced L2 German Writing: A Constructional Approach

The frequent formation of complex, hierarchically structured compounds is a striking property of German grammar to non-natives. This article asks how compounding works in second language (L2) German grammar, by exploring data from the error-annotated Falko corpus of native and advanced non-native German writing. Beyond differences in overall frequency and productivity of L2 compounding, I use a...

Publication date: 2011